Try building rustc with N codegen units #81214

tmiasko · 2021-01-20T17:33:22Z

r? @ghost

tmiasko · 2021-01-20T18:54:34Z

PR CI x86_64-gnu-llvm-9 took 77m (usually it takes ~40m).

@bors try @rust-timer queue

rust-timer · 2021-01-20T18:54:35Z

Awaiting bors try build completion.

bors · 2021-01-20T18:54:47Z

⌛ Trying commit ed6643a390a37d29cb0e8cf323bf3c46cf3a8c9e with merge ad1e2edcc154dde638611f902f9e504f5a8d9c6f...

bors · 2021-01-20T20:14:03Z

☀️ Try build successful - checks-actions
Build commit: ad1e2edcc154dde638611f902f9e504f5a8d9c6f (ad1e2edcc154dde638611f902f9e504f5a8d9c6f)

rust-timer · 2021-01-20T20:14:05Z

Queued ad1e2edcc154dde638611f902f9e504f5a8d9c6f with parent a4cbb44, future comparison URL.

@rustbot label: +S-waiting-on-perf

rust-timer · 2021-01-20T22:29:11Z

Finished benchmarking try commit (ad1e2edcc154dde638611f902f9e504f5a8d9c6f): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

tmiasko · 2021-01-23T23:52:07Z

The size of the unpacked toolchain was also reduced from 404M to 354M (as installed by rustup-toolchain-install-master with default options).

Now with two codegen units (PR CI x86_64-gnu-llvm-9 took 63m).

@bors try @rust-timer queue

rust-timer · 2021-01-23T23:52:08Z

Awaiting bors try build completion.

bors · 2021-01-23T23:52:21Z

⌛ Trying commit 4f9f69512c7c3ff4fdfa298c28f2d339015f0b2d with merge 4c4da1d578ee027b619ce11c645e6de512d26927...

bors · 2021-01-24T01:14:18Z

☀️ Try build successful - checks-actions
Build commit: 4c4da1d578ee027b619ce11c645e6de512d26927 (4c4da1d578ee027b619ce11c645e6de512d26927)

rust-timer · 2021-01-24T01:14:19Z

Queued 4c4da1d578ee027b619ce11c645e6de512d26927 with parent 4d0dd02, future comparison URL.

@rustbot label: +S-waiting-on-perf

rust-timer · 2021-01-24T03:33:08Z

Finished benchmarking try commit (4c4da1d578ee027b619ce11c645e6de512d26927): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

bjorn3 · 2021-01-24T06:33:25Z

Huge improvements of up to 15.5%.

tmiasko · 2021-01-24T12:07:36Z

With three codegen units (PR CI x86_64-gnu-llvm-9 took 44m).

@bors try @rust-timer queue

rust-timer · 2021-01-24T12:07:37Z

Awaiting bors try build completion.

bors · 2021-01-24T12:07:46Z

⌛ Trying commit 7848a41345dcccb14b68adaee9526ebd2117c6be with merge 01e13b863b2b389489961bf7431b7a315264a43e...

bors · 2021-01-24T13:14:21Z

☀️ Try build successful - checks-actions
Build commit: 01e13b863b2b389489961bf7431b7a315264a43e (01e13b863b2b389489961bf7431b7a315264a43e)

rust-timer · 2021-01-24T13:14:22Z

Queued 01e13b863b2b389489961bf7431b7a315264a43e with parent 85e355e, future comparison URL.

@rustbot label: +S-waiting-on-perf

rust-timer · 2021-01-24T15:53:23Z

Finished benchmarking try commit (01e13b863b2b389489961bf7431b7a315264a43e): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

tmiasko · 2021-02-11T15:46:38Z

With rust.codegen-units=4 (PR CI x86_64-gnu-llvm-9 took 44 min):

@bors try @rust-timer queue

rust-timer · 2021-02-11T15:46:39Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2021-02-11T15:46:49Z

⌛ Trying commit 935f7819e5068b1d43cd9dbe904eb626d2961987 with merge bb215a40be948264811fbac2da595208a74e625c...

bors · 2021-02-11T16:47:36Z

☀️ Try build successful - checks-actions
Build commit: bb215a40be948264811fbac2da595208a74e625c (bb215a40be948264811fbac2da595208a74e625c)

rust-timer · 2021-02-11T16:47:37Z

Queued bb215a40be948264811fbac2da595208a74e625c with parent 2918062, future comparison URL.

rust-timer · 2021-02-11T19:41:35Z

Finished benchmarking try commit (bb215a40be948264811fbac2da595208a74e625c): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

tmiasko · 2021-02-11T21:22:02Z

With rust.codegen-units=5 (PR CI x86_64-gnu-llvm-9 took 42 min):

@bors try @rust-timer queue

rust-timer · 2021-02-11T21:22:04Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2021-02-11T21:22:12Z

⌛ Trying commit f8b67ff with merge 03192e87b7e4444897b09bec2729882ed234cbdd...

bors · 2021-02-11T22:22:39Z

☀️ Try build successful - checks-actions
Build commit: 03192e87b7e4444897b09bec2729882ed234cbdd (03192e87b7e4444897b09bec2729882ed234cbdd)

rust-timer · 2021-02-11T22:22:41Z

Queued 03192e87b7e4444897b09bec2729882ed234cbdd with parent e9920ef, future comparison URL.

rust-timer · 2021-02-12T00:37:41Z

Finished benchmarking try commit (03192e87b7e4444897b09bec2729882ed234cbdd): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

tmiasko · 2021-02-14T15:13:16Z

The table, updated with new measurements, is in #81214 (comment)

@Mark-Simulacrum what would be the next steps? Is that sufficient to make a decision, or would you like me to repeat the try builds once again?

bjorn3 · 2021-02-14T15:16:43Z

I am curious what the effect of only building libstd with a single codegen unit would be.

tmiasko · 2021-02-14T15:18:47Z

The standard library is already built with a single codegen unit.

Mark-Simulacrum · 2021-02-14T15:34:01Z

I might be misunderstanding the table, but by my reading it looks like switching to bootstrapping with codegen units affecting only rustc has actually introduced some pretty major regressions - maybe something has changed on master in the meantime?

I guess I am not feeling like we have a complete picture of the constraints here and the effects of various switches. Bootstrap columns in the table seem pretty useless to me -- they're not indicative of a difference between variant 1 and 2 in terms of performance of rustc at runtime.

There are several inputs which can be changed:

- number of codegen units used for rustc
- number of codegen units used for std
- number of codegen units used for tooling like cargo etc (in theory this has no effect, but really we should validate that assumption)

From each of these we have several metrics to evaluate the effect:

- time spent on CI producing these artifacts
- size of produced artifacts on disk
- runtime performance of the artifacts, likely ideally represented as "how close we get to optimal, with these toggles"
  - likely this wants to be roughly "average change" and "worst regression", but ultimately just needs to be a link to perf.rlo
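
To make the "average change" and "worst regression" idea concrete, here is a minimal sketch of how such a summary could be computed from per-benchmark deltas. The `Summary` type, the `summarize` function, and the sample numbers are all hypothetical; none of them come from perf.rlo or from this PR's results.

```rust
/// Summary of per-benchmark changes, in percent.
/// Positive values mean the new compiler did more work (a regression).
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
struct Summary {
    average_change: f64,
    worst_regression: f64,
}

/// Returns `None` for an empty input, otherwise the mean and the maximum.
/// If every entry is an improvement, `worst_regression` is simply the
/// smallest improvement (it stays negative).
fn summarize(changes_percent: &[f64]) -> Option<Summary> {
    if changes_percent.is_empty() {
        return None;
    }
    let sum: f64 = changes_percent.iter().sum();
    let average_change = sum / changes_percent.len() as f64;
    let worst_regression = changes_percent
        .iter()
        .copied()
        .fold(f64::NEG_INFINITY, f64::max);
    Some(Summary { average_change, worst_regression })
}

fn main() {
    // Made-up deltas, not taken from any real benchmark run.
    let changes = [-2.0, 1.0, 4.0, -3.0];
    let s = summarize(&changes).unwrap();
    println!(
        "average change: {:+.2}%, worst regression: {:+.2}%",
        s.average_change, s.worst_regression
    );
}
```

Note how the made-up sample averages out to zero while still containing a 4% regression, which is why both numbers would be wanted side by side.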

I think we don't want to alter the settings when perf.rlo is run, as that skews our data and generally doesn't make for something that's easy to compare. That means any changes in configuration in this PR should be gated on Rust's CI (somehow) rather than just any x.py run.

Right now I'm struggling to read the table in a conclusive way. I don't know exactly what would help, but the primary problem with the current table seems to be that it combines data along these axes in a way that, at least for me, is hard to compare. Maybe splitting it into multiple tables, or putting each metric in its own table, would be helpful; I'm not sure.

tmiasko · 2021-02-14T17:56:59Z

> it looks like switching to bootstrapping with codegen units affecting only rustc has actually introduced some pretty major regressions - maybe something has changed on master in the meantime?

As far as I can see, that mostly reflects the degree to which optimizer decisions made when building rustc affect the results, and how inaccurate the benchmarks are as a consequence. After spending some time profiling, I am entirely unsurprised, given that 50% of all instructions in inflate are in just one function, for keccak in 2 functions, for match-stress-enum in 2 functions, and for ctfe-stress-4 in around 10. Just as in the match-stress-enum case, it should be easy to isolate these functions and direct the optimizer towards making the same decisions it is making now, if one desired to do so.
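
As a rough illustration of the "50% of all instructions in N functions" measure, the sketch below counts how many of the hottest functions are needed to cover a given share of the total. The function name `functions_to_cover` and the instruction counts are hypothetical and are not profiler output from these benchmarks.

```rust
/// Number of hottest functions needed to reach `share` (0.0..=1.0) of all
/// instructions, given one instruction count per function.
/// Hypothetical helper for illustration only.
fn functions_to_cover(counts: &[u64], share: f64) -> usize {
    let total: u64 = counts.iter().sum();
    if total == 0 {
        return 0;
    }
    // Sort descending so the hottest functions are considered first.
    let mut sorted = counts.to_vec();
    sorted.sort_unstable_by(|a, b| b.cmp(a));

    let target = share * total as f64;
    let mut covered = 0u64;
    for (i, count) in sorted.iter().enumerate() {
        covered += count;
        if covered as f64 >= target {
            return i + 1;
        }
    }
    sorted.len()
}

fn main() {
    // Made-up profile: one function dominates.
    let skewed = [600, 100, 100, 100, 100];
    // Made-up profile: two functions are needed to reach half.
    let flatter = [300, 300, 200, 100, 100];
    println!("skewed: {}", functions_to_cover(&skewed, 0.5));
    println!("flatter: {}", functions_to_cover(&flatter, 0.5));
}
```

When this number is 1 or 2, a single changed inlining decision in that function can plausibly move the whole benchmark.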

Given that reducing the number of CGUs for tooling takes around 10 min longer in the try build ("Try 1" vs "Try 2"), that building with one or two CGUs takes at least 10 min longer ("Try 1" & "Try 2") than the other options, and that builds with one, two, three, and four CGUs have regressions in perf results, what about reducing the number of CGUs for rustc to 5, which reduces the size of the toolchain by 8 MB, took 54 min in the try builder, and gave the following perf results?
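
For the build-time side of this trade-off only, here is a small sketch using the PR CI times for x86_64-gnu-llvm-9 quoted earlier in this thread. It assumes the first run (77m) used one codegen unit and treats the "~40m" usual time as an exact baseline. The helper names and the extra-minutes budget are hypothetical, and the sketch ignores perf results and toolchain size entirely.

```rust
/// (codegen units, PR CI minutes) as quoted in this thread.
/// The 77m entry is assumed to correspond to one codegen unit.
const PR_CI_MINUTES: [(u32, u32); 5] = [(1, 77), (2, 63), (3, 44), (4, 44), (5, 42)];

/// Usual PR CI time ("~40m"), treated here as an exact baseline.
const BASELINE_MINUTES: u32 = 40;

/// Slowdown relative to the baseline, in percent.
fn overhead_percent(minutes: u32) -> f64 {
    (minutes as f64 - BASELINE_MINUTES as f64) * 100.0 / BASELINE_MINUTES as f64
}

/// Smallest CGU count whose CI time stays within `max_extra_minutes`
/// of the baseline, or `None` if no configuration fits.
fn smallest_cgus_within(max_extra_minutes: u32) -> Option<u32> {
    PR_CI_MINUTES
        .iter()
        .filter(|&&(_, minutes)| minutes <= BASELINE_MINUTES + max_extra_minutes)
        .map(|&(cgus, _)| cgus)
        .min()
}

fn main() {
    for &(cgus, minutes) in PR_CI_MINUTES.iter() {
        println!("{} CGU(s): {} min ({:+.1}%)", cgus, minutes, overhead_percent(minutes));
    }
    // Hypothetical budget of 5 extra minutes.
    println!("within 5 extra minutes: {:?}", smallest_cgus_within(5));
}
```

On CI time alone a budget of 5 extra minutes would already admit three codegen units; the preference for 5 above comes from the perf regressions seen with fewer units, which this sketch does not model.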

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 20, 2021

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jan 20, 2021

tmiasko force-pushed the 1 branch from ed6643a to 4f9f695 Compare January 23, 2021 22:45

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 24, 2021

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 24, 2021

tmiasko force-pushed the 1 branch from 4f9f695 to 7848a41 Compare January 24, 2021 11:06

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 24, 2021

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 11, 2021

tmiasko force-pushed the 1 branch from 9eacd69 to 935f781 Compare February 11, 2021 14:40

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 11, 2021

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 11, 2021

Try building rustc with codegen-units=5

f8b67ff

tmiasko force-pushed the 1 branch from 935f781 to f8b67ff Compare February 11, 2021 20:31

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 11, 2021

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 12, 2021

Mark-Simulacrum removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 12, 2021

tmiasko closed this Feb 27, 2021

tmiasko deleted the 1 branch February 27, 2021 11:11
